TITLE: DEVELOPMENT OF TEACHER-ADMINISTERED TESTS FOR THE SWRL
READING PROGRAMS

AUTHORS: Fred C. Niedermeyer and Howard J. Sullivan 



ABSTRACT 



To investigate the type of classroom testing format most appropriate
for the SWRL Mod 2 Reading Program, three types of teacher-administered
tests for the SWRL Second-Year Communication Skills Program were developed
and tried out during the 1970-71 school year. The tests were administered
by the classroom teacher as Criterion Exercises following each unit of
instruction. Two of the tests were group-administered and had a
multiple-choice format, with one type consisting of three-choice items and
the other of four-choice items. The third type of test had an individually
administered, constructed-response format. Both the individually
administered, constructed-response tests and the four-choice,
group-administered tests predicted scores on an end-of-year criterion test
well, whereas the three-choice tests did not. Based on pupil-performance
data and teacher reactions, the individually administered tests appear
to be most appropriate for use as Criterion Exercises in the Mod 2
program.
DEVELOPMENT OF TEACHER-ADMINISTERED TESTS FOR THE SWRL READING PROGRAMS

This report describes the procedures used to develop a classroom
testing format most appropriate for the SWRL Reading Programs. Some of
the procedures utilized and knowledge gained may be generalizable to
other objectives-based, teacher-administered instructional programs,
particularly those concerned with the teaching of reading.

PROBLEM 

In the SWRL Reading Programs, short criterion tests (called
Criterion Exercises) are administered at two- to three-week intervals
throughout the year to tell the teacher which children are achieving
the program outcomes and which children need additional practice and
remediation. Up through the 1969-70 school year, these tests had
always been of a group-administered, selected-response format, with
children marking their answers after choosing one of three choices for
each item.

For some time, however, it had been suspected that these three-choice,
selected-response Criterion Exercises (CEs) did not adequately
identify those pupils needing remediation, i.e., the tests were too
easy. Teachers contended that some children would score quite high
on the CEs but still could not "read." To determine the extent to
which this was true, a constructed-response test (where the child
actually reads rather than selects the word or letter-sound) was
individually administered to a random sample of 159 children at the end
of the 1968-69 school year in 20 classes using SWRL's Kindergarten
Reading Program. Each child's score on this end-of-year mastery test
was then paired with his average score over all of the CEs taken during
the year. These pairs were plotted on a scattergram, and correlation
and regression analyses were performed. This scattergram appears as Figure 1.
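
As an illustration only, the pairing, correlation and regression steps described above might be sketched as follows. The score pairs, the function names and the 80% target passed to the last helper are assumptions made for the example; they are not the study's data or SWRL's actual analysis procedure. The last helper inverts the fitted line to find the CE average at which a given posttest score would be predicted.

```python
from math import sqrt


def _sums(xs, ys):
    """Return the means and the sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def regression_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ce_needed_for(posttest_target, slope, intercept):
    """CE average at which the fitted line predicts the target posttest score."""
    return (posttest_target - intercept) / slope


# Hypothetical (average CE %, posttest %) pairs, NOT the study's data.
ce_avg = [70.0, 80.0, 85.0, 90.0, 95.0, 100.0]
posttest = [30.0, 50.0, 55.0, 70.0, 80.0, 95.0]

r = pearson_r(ce_avg, posttest)
slope, intercept = regression_line(ce_avg, posttest)
threshold = ce_needed_for(80.0, slope, intercept)
print(f"r = {r:.2f}; CE average predicting 80% on the posttest: {threshold:.1f}%")
```

With real score pairs in place of the hypothetical lists, the same three calls would give the correlation and the CE average associated with the mastery criterion.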

From Figure 1 two things may be noted. First, the relationship
between the average CE scores and the scores on the end-of-year mastery
test was, as might be expected, quite high (r = .80). More important,
however, was the fact that a child had to obtain a very high average on
the CEs (almost 95%) before one could predict that he would score at
least 80% (a criterion of mastery established by SWRL) on the end-of-year
test. In fact, 32 of the 159 children (20%) averaged 90% or better on
the CEs but still scored below the 80% mastery criterion on the
end-of-year posttest. Thus, not only were the CEs giving the teachers a false
indication of achievement for these children during the year, but the
high scores on the CEs prevented these children from being assigned
additional instruction when, in fact, remediation was required.

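
The count reported above (pupils averaging 90% or better on the CEs but scoring below 80% on the posttest) can be expressed as a simple filter over score pairs. The sketch below is a hypothetical illustration: the function name and the five sample pairs are invented, and only the final line uses figures taken from the report (32 of 159).

```python
def false_mastery(pairs, ce_cutoff=90.0, posttest_cutoff=80.0):
    """Return the (CE average, posttest) pairs that met the CE cutoff
    but fell below the posttest mastery criterion."""
    return [
        (ce, post)
        for ce, post in pairs
        if ce >= ce_cutoff and post < posttest_cutoff
    ]


# Hypothetical (average CE %, posttest %) pairs, NOT the study's data.
pairs = [(96.0, 88.0), (92.0, 71.0), (85.0, 60.0), (99.0, 95.0), (90.0, 79.0)]
flagged = false_mastery(pairs)
print(f"{len(flagged)} of {len(pairs)} hypothetical pupils flagged")

# Proportion implied by the counts reported for the 1968-69 sample.
reported_rate = 32 / 159
print(f"reported: {reported_rate:.1%}")
```
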
The purpose of the present study was to develop and try out other
types of CEs: tests that, it was hoped, would be more accurate indicators of
how well a child would do at the end of the program. To this end, two
types of CEs were developed in addition to the already existing
three-choice, selected-response tests. For one type, a fourth distractor was
added to each item of the existing three-choice CEs, thus creating a
four-choice, selected-response CE. This fourth distractor was selected
in such a way as to make the item more difficult.

Figure 1. Scattergram showing relationship between the average CE score and the
posttest score for 159 pupils in the SWRL Kindergarten Reading Program.

Finally, a different type of CE altogether was developed. This was
an individually administered, constructed-response CE, which was similar
in format to the end-of-year posttest. This type of CE had not been
developed previously because it was felt that the group-administered,
selected-response exercises were faster and easier to administer.

The procedures used and results obtained when developing and
trying out these various types of CEs with the SWRL First-Grade Reading
Program are described in the remainder of this report.

DESCRIPTION OF TESTS

This section describes each of the three types of tests developed
and tried out during this study: the regular three-choice,
selected-response Criterion Exercises (3-choice CEs); the newly developed,
four-choice, selected-response Criterion Exercises (4-choice CEs); and
the newly developed, individually administered, constructed-response
Criterion Exercises (constructed-response CEs).

3-choice CEs. The 3-choice CEs were already being utilized in the
SWRL reading programs prior to this study. In the SWRL First-Grade
Reading Program, there were 14 of these tests, one for each of the 14
two-week instructional units comprising the program. Each CE consisted
of 24 items, with eight items for each of the three program outcomes:
words, word elements and word-attack. The child's response booklet was
four pages long with six items on a page. (See Figure 2 for a typical
page from one of these CEs.) The first two rows of each page tested
Outcome 1 (Words), the middle two rows always tested Outcome 2 (Word
Elements) and the bottom two rows always tested Outcome 3 (Word-Attack).

Figure 2. Sample page from one of the 3-choice CEs

The teacher's script for the page shown in Figure 2 would read as
follows:

Row 1: Mark the word I've.
Row 2: Mark the word blue.
Row 3: Mark vvv.
Row 4: Mark ick.
Row 5: Mark six.
Row 6: Mark fig.

4-choice CEs. The 4-choice CEs were developed directly from the
existing 3-choice CEs by simply adding a fourth distractor to each item.
The fourth distractor was generated in such a way as to make the item
more difficult. For word recognition (Outcome 1, Rows 1 and 2), this
usually meant adding another available word that possessed some of the
same letters as the test word. For word elements or letter-sound
correspondences (Outcome 2, Rows 3 and 4), a letter or letters were
selected that were similar in construction to the tested element. For
word-attack (Outcome 3, Rows 5 and 6), the fourth distractor was
generated wherever possible by finding a word with the same initial
and ending letters as the test word, but with a different medial vowel.
The previous 3-choice CEs had been constructed such that, if possible,
one of the two distractors had the same beginning sound as the test
word and the other the same ending sound. (See Item 6 in Figure 2.)
For example, when testing the word fig, the 3-choice exercises already
had the distractors fix and dig. Adding fog as a fourth distractor
seemed likely to make it harder for the child to figure
out the correct answer by eliminating distractors, and to force him
to simply read the four words to find the test word. It had been felt
that many children were able to "break the code" on the 3-choice CEs by
eliminating distractors with different beginning or ending sounds than
the word given orally by the teacher. Figure 3 contains a page from the
4-choice CEs that corresponds to the 3-choice page in Figure 2.
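
The word-attack rule described above (same initial and ending letters, different medial vowel) lends itself to a short sketch. The code below is only an illustration under stated assumptions: the function name and the small vocabulary set are hypothetical, the sketch handles three-letter consonant-vowel-consonant words only, and nothing in the report indicates that the distractors were generated by program rather than by hand.

```python
VOWELS = "aeiou"


def medial_vowel_distractors(test_word, vocabulary):
    """Return vocabulary words that differ from test_word only in the
    medial vowel (three-letter consonant-vowel-consonant words only)."""
    if len(test_word) != 3 or test_word[1] not in VOWELS:
        return []  # outside the scope of this sketch
    found = []
    for vowel in VOWELS:
        if vowel == test_word[1]:
            continue  # skip the test word itself
        candidate = test_word[0] + vowel + test_word[2]
        if candidate in vocabulary:
            found.append(candidate)
    return found


# Hypothetical list of words available to the item writer.
vocabulary = {"fig", "fix", "dig", "fog", "fan"}
print(medial_vowel_distractors("fig", vocabulary))  # prints ['fog']
```

For the report's own example, the existing distractors fix and dig share only a beginning or an ending sound with fig, whereas the candidate returned here shares both.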

Constructed-response CEs. The constructed-response CEs were formed
by simply listing the words, word elements, and word-attack words for
the unit on a 5" x 8" card (Figure 4). The teacher administering this
type of test simply asks a child to come up and read through the card,
while the teacher checks any incorrect responses on a separate score
sheet. These tests were limited to fifteen items, or five items per
outcome. Most of the instructional units in the program contained
fewer than five word elements, and rather than repeat the same letter
twice (as was done on the selected-response CEs), the five-item block
for "Word Elements" was simply not filled up. Thus, the number of
items in each of the 14 constructed-response CEs varied from 11 to 15.
To prevent teachers from taking undue time to administer this test to
a child, special instructions were given not to "instruct" a child when
an error occurred, but to go right on and worry about remediation later.

Figure 3. Sample page from one of the 4-choice CEs (corresponds to page from
3-choice CE in Figure 2).

Figure 4. Sample constructed-response CE

TRYOUT PROCEDURES

Subjects

3- and 4-choice CEs. The children in each of five first-grade
classes at a suburban school in a large metropolitan district were
randomly assigned, half to the regular 3-choice CEs and half to the newly
developed 4-choice CEs. To ensure that each child always received the
proper type of CE, names were written on all 14 tests for each child at
the beginning of the year. Since the 3-choice and the 4-choice CEs were
identical except for the number of distractors (i.e., the answers to be
marked were the same for either type of test), the teacher could administer
both types at once, using the same set of directions. Appendix A contains
the written procedures given the teachers for administering both the 3-choice
and the 4-choice CEs.
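
The within-class random assignment to the 3-choice and 4-choice CEs, and the pre-labeling of all 14 booklets for each child, might be sketched as follows. This is a hypothetical illustration: the roster of 30 pupils, the seed and the function name are assumptions, and the report does not say what randomization device was actually used.

```python
import random


def split_class(roster, rng):
    """Randomly divide one class roster into a 3-choice half and a 4-choice half."""
    shuffled = list(roster)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "3-choice": sorted(shuffled[:half]),
        "4-choice": sorted(shuffled[half:]),
    }


rng = random.Random(1970)  # fixed seed only so the sketch is repeatable
roster = [f"pupil_{i:02d}" for i in range(1, 31)]  # hypothetical class of 30
groups = split_class(roster, rng)

# Label one booklet per unit (14 units) for every child, so that each child
# receives the same type of CE throughout the year.
booklets = {
    name: [(unit, form) for unit in range(1, 15)]
    for form, names in groups.items()
    for name in names
}
print(len(groups["3-choice"]), len(groups["4-choice"]), len(booklets))
```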

Constructed-response CEs. All of the children in five classes in
two suburban schools in another district were assigned to receive the
constructed-response CEs during the entire school year. Previous
tryouts of the reading program indicated that the achievement levels
obtained at these two schools were fairly comparable to those of the
school using the 3-choice and 4-choice CEs. Special administration
and scoring procedures given teachers for the constructed-response CEs
are contained in Appendix B.

Data Source

In addition to the scores on the three types of CEs during the
year, the primary data source for the study was the set of scores on a
56-item, constructed-response mastery test individually administered by
SWRL personnel in all ten classes at the end of the year. (See Appendix C.)
The test was administered to a random sample of eight pupils in each of
the five constructed-response classes and a random sample of 16 pupils
(eight from the 3-choice CE group and eight from the 4-choice CE group)
in each of the five classes using both the 3-choice and the 4-choice CEs.
This produced a total of 40 posttest scores for each of the three types
of tests.

Class Record Sheets containing scores on the Criterion Exercises
were also collected from teachers throughout the year. In addition, the
completed test booklets themselves were collected from the five classes
utilizing the 3-choice and 4-choice, selected-response CEs. Complete
posttest and CE data were obtained for 40 children in the 3-choice CE
group, 36 children in the 4-choice CE group and 33 children in the
constructed-response CE group.

Mid-year meetings were held with teachers in all classes to
obtain their comments, criticisms and suggestions regarding the three
types of tests.

RESULTS 

Pupil Performance Data

Table 1 indicates the correlation between average CE scores and
posttest scores for each of the three groups. It may be seen that there
was a high, positive relationship between CE scores and the posttest
scores for all three groups. The question still remains, however, which
test gives the teacher the most valid estimate of how children will
perform at the end of the year?

Table 1

CORRELATION COEFFICIENTS (PRODUCT MOMENT) OF RELATIONSHIP BETWEEN
AVERAGE CE SCORES AND POSTTEST SCORES FOR EACH TYPE OF CE

                                        3-choice      4-choice     Constructed-
                                          CEs           CEs        Response CEs
                                         (n=40)        (n=36)         (n=33)

Correlation coefficient between
average CE score and posttest score    [illegible]      .83*           .78*

* p < .01

Table 2

MEANS AND STANDARD DEVIATIONS ON CEs AND POSTTEST
FOR EACH TYPE OF CE

                             3-choice CEs      4-choice CEs      Constructed-
                                                                 Response CEs
                               (n=40)            (n=36)            (n=33)
                            Mean %    sd      Mean %    sd      Mean %    sd

Mean CE score (%)            98.2     4.6      94.0     7.2      89.8    12.3
Mean Posttest Score (%)      86.1    16.0      86.3    15.1      80.6    18.7

Table 2 shows the mean percentage scores by outcome on the CEs and
on the end-of-year posttest for the pupils in each of the three groups
receiving the three types of tests. It may be seen that the average
CE scores for the 3-choice CE Ss and the 4-choice CE Ss were 98.2% and
94.0%, respectively. This represents about a one-item difference on the
24-item tests. When it is noted, however, that a child need miss only
one item on an outcome to be assigned a remedial Practice Exercise,
this one-point difference means that about one more Practice Exercise
per child, or about 30 additional Practice Exercises per class, is
generated by the 4-choice CEs than by the 3-choice CEs. The mean CE
score for the constructed-response CE Ss was 89.8%, but it is hard to
compare this group to the 3-choice and 4-choice CE groups, since the
constructed-response CE Ss came from a different population than the Ss
receiving either 3-choice or 4-choice CEs on a random basis.
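
The "about a one-item difference" follows directly from the Table 2 means (98.2% and 94.0%) and the 24-item test length. A minimal arithmetic check, in which the function name is hypothetical:

```python
ITEMS_PER_CE = 24  # items on each selected-response CE


def percent_gap_in_items(mean_a, mean_b, items=ITEMS_PER_CE):
    """Convert the gap between two mean percentage scores into test items."""
    return (mean_a - mean_b) / 100.0 * items


# Mean CE scores for the 3-choice and 4-choice groups, from Table 2.
gap = percent_gap_in_items(98.2, 94.0)
print(f"{gap:.2f} items")
```

A gap of 4.2 percentage points on 24 items comes to almost exactly one item.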

The posttest scores in Table 2 reveal very little difference
between the 3-choice CE and the 4-choice CE groups, and there is little
reason to expect much difference. The lower score for the
constructed-response CE group probably reflects the fact that these Ss were drawn
from a different population than the other two groups.

Figures 5, 6 and 7 are scattergrams showing the relationship of
CE scores to posttest scores for the 3-choice CE group, the 4-choice CE
group and the constructed-response CE group, respectively. From the
scattergram for the 3-choice CE Ss (Figure 5) it may be seen that 11 of
the 40 Ss averaged higher than 90% on the CEs, yet scored less than
the mastery criterion of 80% on the posttest. Thus, for 28% of the
pupils in the 3-choice CE group, the teacher was receiving indication
during the year that the children were performing quite well (over 90%),
yet those children failed to reach mastery on the end-of-year test.
With the other two types of Criterion Exercises, however, this rarely
happened. For the 4-choice CE group (Figure 6) and the
constructed-response CE group (Figure 7), it may be seen that only two Ss in
each group (6%) averaged over 90% on the CEs, yet scored below the
mastery level of 80% on the posttest.
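
The percentages quoted above can be reproduced from the reported counts and group sizes. The sketch below uses only figures stated in the text; the dictionary layout and the helper name are assumptions made for the illustration.

```python
# (pupils above 90% on the CEs but below 80% on the posttest, group size),
# as reported in the text for Figures 5, 6 and 7.
reported = {
    "3-choice": (11, 40),
    "4-choice": (2, 36),
    "constructed-response": (2, 33),
}


def percent(count, total):
    """Express count as a percentage of total."""
    return 100.0 * count / total


rates = {group: round(percent(c, n), 1) for group, (c, n) in reported.items()}
for group, rate in rates.items():
    print(f"{group}: {rate}%")
```

Rounded to whole numbers, these are the 28% and 6% figures given in the text.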






Figure 5. Scattergram showing relationship between the average CE score and the
posttest score for 40 first-grade pupils receiving the 3-choice CEs.

Figure 6. Scattergram showing relationship between the average CE score and the
posttest score for 36 first-grade pupils receiving the 4-choice CEs.

Figure 7. Scattergram showing relationship between the average CE score and the
posttest score for 33 first-grade pupils receiving the constructed-response
CEs.

Results of Teacher Meetings

At a meeting of the five teachers utilizing the 3-choice and the
4-choice CEs, the teachers indicated a preference for the 4-choice CEs.
From scoring the CEs before returning them to SWRL, it seemed apparent
to these teachers that the 4-choice CEs generated lower scores and were
a more valid measure of reading ability.

All five teachers utilizing the constructed-response CEs indicated
that they preferred these individually administered tests over the
group-administered, selected-response tests. (These teachers were
familiar with the latter type of test in that they had used
selected-response CEs the previous year with the SWRL Reading Program.) Among
the reasons mentioned for this preference of constructed-response CEs
over selected-response CEs were the following:

• eliminates copying

• gives the teacher better knowledge of an individual child's
  skill level (because the constructed-response CEs are more
  difficult and because of the one-on-one administration)

• makes the child feel good when he receives the individual
  attention of the teacher

• easier to administer

• takes less time (less than one minute per child, or about
  half the time required to administer, score and record the
  selected-response CEs to a group of 10 or 12 children)

Each of the 10 first-grade teachers involved in the study had divided the
class into three groups for reading instruction.
DISCUSSION

The regular 3-choice CEs normally used in the SWRL First-Grade
Reading Program produced a mean score of 98.2% across five classrooms.
Twenty-eight percent of the children receiving this type of test
averaged above 90% on the CEs during the year, yet scored below the
mastery criterion of 80% on the end-of-year test. These data, combined
with favorable teacher reactions to the two new formats,
indicate that either the 4-choice CEs or the constructed-response CEs
would be more appropriate than the 3-choice CEs for the SWRL Reading
Programs.

The highly positive reactions of the teachers utilizing the
constructed-response CEs were somewhat unexpected. It had been felt that
a teacher with 30 students would find this type of test too time-consuming
to administer on a fairly frequent basis. As it turned out, however,
each teacher had grouped the children into three groups for reading
instruction. Thus, a teacher never had to administer any one CE to
more than one group at a time (anywhere from six to twelve children).
Had the classes not been grouped, the problem of time might have come up,
since the teacher would have had to administer the same test to all 30
children individually in a single session.

Thus, when teachers group, as is usually done in first grade and
in higher grades, the constructed-response CEs would probably be most
appropriate in that this type of test actually has the children read,
rather than select, words and sounds. In kindergarten, however, teachers
normally have not grouped for instruction and have kept the entire class
together during the year. In this situation, the group-administered,
4-choice CE would probably be most efficient.

In summary, the results of the study indicate that the
group-administered, 4-choice tests would be most appropriate in SWRL Reading
Programs at the kindergarten level and the individually administered,
constructed-response CEs would be most appropriate at first grade or
above. The currently used 3-choice CEs do not provide an accurate
indication of achievement for many of the children and their continued
use is not recommended.






APPENDIX A

Teacher Procedures for 3-choice and 4-choice CEs

Criterion Exercises

Administration

Arrange seating so that the children cannot see each other's answers.
Be sure each child's name is on his booklet.

Read each item exactly as printed in Directions for the exercise
(Direction Cards in Program File Box).

Help children find the correct row or page only if necessary and
only for the first two exercises.

Do not provide hints or clues to the right answer.

Do not give the correct answer after children have marked the item.

Scoring

Enter names on Class Record Sheet.

Score each booklet using Scoring Key from Directions for the exercise.
While scoring, make a checkmark beside each item not marked or marked
incorrectly.

After all booklets are marked, total the first child's score for
Rows 1 and 2, Rows 3 and 4, and Rows 5 and 6. Enter his score for
Rows 1 and 2 in Column 1 in the Class Record Sheet, Rows 3 and 4
in Column 2, and Rows 5 and 6 in Column 3.

Enter the child's total score for the three outcomes in the "Total
Correct" column.

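
The scoring steps above (row pairs totaled into three outcome columns, plus a total) can be sketched as follows. The representation of a booklet as a mapping from (page, row) to the option marked, the all-"b" scoring key and the function name are assumptions made for the illustration; the actual Scoring Key is not reproduced in this report.

```python
# Rows 1-2 test Outcome 1, Rows 3-4 Outcome 2, Rows 5-6 Outcome 3.
OUTCOME_BY_ROW = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3}


def score_booklet(marked, key):
    """Total correct items per outcome for one child's booklet.

    marked and key map (page, row) to an option; items the child left
    unmarked are simply absent from marked and count as incorrect.
    """
    scores = {1: 0, 2: 0, 3: 0}
    for (page, row), correct in key.items():
        if marked.get((page, row)) == correct:
            scores[OUTCOME_BY_ROW[row]] += 1
    scores["total"] = scores[1] + scores[2] + scores[3]
    return scores


# Hypothetical key: four pages of six rows, correct option always "b".
key = {(page, row): "b" for page in range(1, 5) for row in range(1, 7)}

# Hypothetical child: one wrong mark (page 1, row 5), one item left blank.
marked = dict(key)
marked[(1, 5)] = "a"
del marked[(3, 2)]

scores = score_booklet(marked, key)
print(scores)  # prints {1: 7, 2: 8, 3: 7, 'total': 22}
```

The three per-outcome values correspond to Columns 1 to 3 of the Class Record Sheet, and the last value to the "Total Correct" column.
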

APPENDIX B

Teacher Procedures for Constructed-response CEs

PROCEDURES FOR INDIVIDUALLY ADMINISTERED CRITERION EXERCISES

The procedures below are for use with the individually administered Criterion
Exercises. They replace the procedures for group-administered Criterion Exercises
listed on page 51 of the Teacher's Manual.

Administration and Scoring¹

• Place the child so that he won't be distracted by the other children.

• Locate or write his name on the Criterion Exercise Record Sheet.

• Place the Criterion Exercise in front of the child so he can see it easily.

• Read the directions exactly as they are printed for each of the 15 items.²

• For each item not answered or answered incorrectly, place a check in the
  appropriate outcome column for the Criterion Exercise being tested. If the
  child reads all the items correctly, place a check (✓) in the 100% column.

                          Criterion Exercise 1
                          Outcome
Student                   1      2      3      100%

1. Bobby Jackson          ✓✓            ✓
2. Nancy Bennett                               ✓


For example, from the Criterion Exercise Record Sheet above, it may be seen
that Bobby Jackson missed two items on Section 1 (Outcome 1: Words, Items
1-5) of the Criterion Exercise for Unit 1, none on Section 2 (Outcome 2:
Word Elements, Items 6-10) and one on Section 3 (Outcome 3: Word Attack,
Items 11-15). Nancy Bennett, however, read all the items correctly.

• Help the child find the correct number only if necessary.

• Do not confirm correct answers.

• Do not provide hints or clues.

• Do not give the correct answer if the child is incorrect or does not answer.

Assigning Practice Exercises

Each child with one or more checks in an outcome column on the Criterion
Exercise Record Sheet should receive the Practice Exercise corresponding to
that outcome. Outcomes 1 (Words), 2 (Word Elements), and 3 (Word Attack)
correspond to Practice Exercises "a," "b," and "c," respectively. Procedures
for administering the Practice Exercises are listed on page 52 of the
Teacher's Manual.
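
The assignment rule above reduces to a lookup from outcome to Practice Exercise letter. In the sketch below, the function name and the dictionary representation of a Record Sheet row are assumptions; the two sample rows restate the Bobby Jackson and Nancy Bennett example from this appendix.

```python
# Outcomes 1, 2 and 3 correspond to Practice Exercises "a", "b" and "c".
PRACTICE_EXERCISE = {1: "a", 2: "b", 3: "c"}


def practice_exercises(checks):
    """Return the Practice Exercises for one child, given the number of
    checks recorded in each outcome column of the Record Sheet."""
    return [
        PRACTICE_EXERCISE[outcome]
        for outcome in sorted(PRACTICE_EXERCISE)
        if checks.get(outcome, 0) >= 1
    ]


bobby = {1: 2, 2: 0, 3: 1}  # two checks, none, one check
nancy = {1: 0, 2: 0, 3: 0}  # all items read correctly

print(practice_exercises(bobby))  # prints ['a', 'c']
print(practice_exercises(nancy))  # prints []
```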



lit should take only about one minute per child to individually administer 
the exercise. ^ 

^ }Fot some units there are less than 15 items, but this should make no 

bl\lv> difference in the procedures. 